ACL 2007 The LAW: Proceedings of the Linguistic Annotation Workshop
Abstract
In this paper we describe the Graph Annotation Format (GrAF) and show how it is used to represent not only independent linguistic annotations, but also sets of merged annotations, as a single graph. To demonstrate this, we have automatically transduced several different annotations of the Wall Street Journal corpus into GrAF, and we show how the annotations can then be merged, analyzed, and visualized using standard graph algorithms and tools. We also discuss how GrAF, as a standard graph representation, allows well-established graph traversal and analysis algorithms to be applied to produce information about interactions and commonalities among merged annotations. GrAF is an extension of the Linguistic Annotation Framework (LAF) (Ide and Romary, 2004, 2006) developed within ISO TC37 SC4 and, as such, implements state-of-the-art best practice guidelines for representing linguistic annotations.
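The idea of merging annotations into a single graph and then traversing it can be illustrated with a minimal Python sketch. This is not the GrAF serialization or the authors' tooling: the function names (`merge_layers`, `spans_under`, `shared_extents`), the layer names, the node labels, and the three-word example text are all hypothetical. It assumes each annotation layer points at character spans of the same primary text, so that layers meet at identical spans.

```python
from collections import defaultdict


def merge_layers(*layers):
    """Merge annotation layers into one graph that shares text spans.

    Each layer is (name, nodes, edges), where nodes maps a string id to a
    label and edges lists (parent_id, child) pairs. A child is either
    another node id (str) or a (start, end) character span (tuple).
    """
    labels = {}
    children = defaultdict(list)
    for name, nodes, edges in layers:
        for node_id, label in nodes.items():
            labels[(name, node_id)] = label  # namespace ids per layer
        for parent, child in edges:
            # Spans are left un-namespaced so that all layers share them.
            target = child if isinstance(child, tuple) else (name, child)
            children[(name, parent)].append(target)
    return labels, dict(children)


def spans_under(node, labels, children):
    """Depth-first traversal collecting the spans a node dominates."""
    found, seen, stack = set(), set(), [node]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in labels:  # annotation node: keep descending
            stack.extend(children.get(current, []))
        else:  # anything else is a span of the primary text
            found.add(current)
    return found


def shared_extents(labels, children):
    """Group nodes from different layers that cover exactly the same spans."""
    by_extent = defaultdict(list)
    for node in labels:
        by_extent[frozenset(spans_under(node, labels, children))].append(node)
    return [sorted(group) for group in by_extent.values()
            if len({layer for layer, _ in group}) > 1]


# Hypothetical layers over the text "The cat sleeps".
syntax = ("syntax", {"np1": "NP", "vp1": "VP", "s1": "S"},
          [("np1", (0, 3)), ("np1", (4, 7)), ("vp1", (8, 14)),
           ("s1", "np1"), ("s1", "vp1")])
roles = ("srl", {"a0": "ARG0", "rel": "REL"},
         [("a0", (0, 3)), ("a0", (4, 7)), ("rel", (8, 14))])

labels, children = merge_layers(syntax, roles)
shared = shared_extents(labels, children)
print(shared)
```

In this toy example the hypothetical `NP` node and `ARG0` node cover the same two spans, so they are reported together as a commonality between the two layers; the `S` node has no counterpart in the other layer and is not reported.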
Similar resources
ACL 2007: Proceedings of the Second Workshop on Statistical Machine Translation, Prague, The Association for Computational Linguistics
ACL 2007: Proceedings of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing, June 28-29, 2007, Prague, Czech Republic
By all these lovely tokens... Merging Conflicting Tokenizations
Given the contemporary trend toward modular NLP architectures and multiple annotation frameworks, the existence of concurrent tokenizations of the same text is a pervasive problem in everyday NLP practice and poses a non-trivial theoretical problem for the integration of linguistic annotations and their interpretability in general. This paper describes a solution for integrating different ...